315 research outputs found

    Performance and power optimizations in chip multiprocessors for throughput-aware computation

    The so-called "power (or power density) wall" has slowed the growth of core frequency (and single-thread performance), giving rise to the era of multi-core/multi-thread processors. For example, the IBM POWER4 processor, released in 2001, incorporated two single-thread cores into the same chip. In 2010, IBM released the POWER7 processor with eight 4-thread cores in the same chip, for a total capacity of 32 execution contexts. The ever-increasing number of cores and threads gives rise to new opportunities and challenges for software and hardware architects. At the software level, applications can benefit from the abundant execution contexts to boost throughput, but this challenges programmers to create highly parallel applications and operating systems capable of scheduling them correctly. At the hardware level, the increasing core and thread count puts pressure on the memory interface, because memory bandwidth grows at a slower pace (a phenomenon known as the "bandwidth (or memory) wall"). In addition to memory bandwidth issues, chip power consumption rises because manufacturers struggle to lower operating voltages sufficiently with every processor generation. This thesis presents innovations to improve bandwidth and power consumption in chip multiprocessors (CMPs) for throughput-aware computation: a bandwidth-optimized last-level cache (LLC), a bandwidth-optimized vector register file, and a power/performance-aware thread placement heuristic. In contrast to state-of-the-art LLC designs, our organization avoids data replication and, hence, does not require keeping data coherent. Instead, the address space is statically distributed across the whole LLC with fine-grained interleaving. The absence of data replication increases the effective cache capacity, which results in better hit rates and higher bandwidth compared to a coherent LLC. We use double buffering to hide the extra access latency due to the lack of data replication.
The proposed vector register file is composed of thousands of registers and organized as an aggregation of banks. We leverage this organization to attach small special-function "local computation elements" (LCEs) to each bank. This approach, referred to as the "processor-in-regfile" (PIR) strategy, overcomes the limited number of register file ports. Because each LCE is a SIMD computation element and all of them can proceed concurrently, the PIR strategy constitutes a highly parallel super-wide-SIMD device (ideal for throughput-aware computation). Finally, we present a heuristic to reduce chip power consumption by dynamically placing software (application) threads across hardware (physical) threads. The heuristic gathers chip-level power and performance information at runtime to infer characteristics of the applications being executed. For example, if an application's threads share data, the heuristic may decide to place them on fewer cores to favor inter-thread data sharing and communication. In that case, the number of active cores decreases, which is a good opportunity to switch off the unused cores to save power. It is increasingly hard to find bulletproof (micro-)architectural solutions for the bandwidth and power scalability limitations in CMPs. Consequently, we think that architects should attack those problems from different flanks simultaneously, with complementary innovations. This thesis contributes a battery of solutions to alleviate those problems in the context of throughput-aware computation: 1) a bandwidth-optimized LLC; 2) a bandwidth-optimized register file organization; and 3) a simple technique to improve power-performance efficiency.
The excessive power consumption of current processors has slowed the increase in their operating frequency, giving rise to the era of processors with multiple cores and multiple execution threads. For example, the IBM POWER7 processor, released in 2010, incorporates eight cores on the same chip, with four execution threads per core. This gives rise to new opportunities and challenges for software and hardware architects. At the software level, applications can benefit from the abundant number of cores and execution threads to increase throughput. But this forces programmers to create highly parallel applications and operating systems capable of scheduling their execution correctly. At the hardware level, the growing number of cores and execution threads puts pressure on the memory interface, since memory bandwidth grows at a slower pace. In addition to memory bandwidth problems, chip energy consumption rises because manufacturers have difficulty reducing operating voltages sufficiently between processor generations. This thesis presents innovations to improve bandwidth and energy consumption in multicore processors for throughput-aware computation: a last-level cache (LLC) optimized for bandwidth, a vector register file optimized for bandwidth, and a heuristic for scheduling the execution of parallel applications aimed at improving power and performance efficiency. In contrast to state-of-the-art LLC designs, our organization avoids data duplication and therefore does not require coherence techniques. The memory address space is statically distributed across the LLC with fine-grained interleaving. The absence of data replication increases the effective capacity of the cache, which translates into better hit rates and higher bandwidth compared with a coherent LLC.
We use double buffering to hide the additional latency needed to access remote data. The proposed vector register file is made up of thousands of registers and organized as an aggregation of banks. We attach to each bank a small special-purpose compute unit ("local computation element" or LCE). This approach, which we call "computation in the register file", overcomes the limited number of register file ports. Because each LCE is a compute unit with SIMD ("single instruction, multiple data") support and all of them can proceed concurrently, the "computation in the register file" strategy constitutes a highly parallel SIMD device. Finally, we present a heuristic for scheduling the execution of parallel applications aimed at reducing chip energy consumption by dynamically placing software threads across hardware threads. The heuristic obtains chip power and performance information at run time to infer the characteristics of the applications. For example, if the software threads share data significantly, the heuristic may decide to place them on fewer cores to favor data exchange among them. In that case, the unused cores can be switched off to save energy. It is increasingly difficult to find "bulletproof" architectural solutions to the scalability limitations of current processors. Consequently, we believe that architects should attack these problems from different flanks simultaneously, with complementary innovations.
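
As a rough illustration of the static, fine-grained interleaving of the address space that this abstract describes, the following Python sketch maps every cache line to exactly one LLC bank. The line size, the bank count, the modulo mapping, and the function names are all assumptions chosen for illustration; the thesis's actual mapping function is not stated in the abstract.

```python
# Illustrative sketch only: these parameters are assumed values,
# not figures taken from the thesis.
LINE_BYTES = 64   # assumed cache-line size in bytes
NUM_BANKS = 8     # assumed number of LLC banks


def home_bank(address, line_bytes=LINE_BYTES, num_banks=NUM_BANKS):
    """Return the single LLC bank that statically owns the line holding `address`."""
    line_index = address // line_bytes
    return line_index % num_banks  # consecutive lines go to consecutive banks


def bank_histogram(start, length):
    """Count how many lines of a contiguous buffer map to each bank."""
    counts = [0] * NUM_BANKS
    first_line = start // LINE_BYTES
    last_line = (start + length - 1) // LINE_BYTES
    for line in range(first_line, last_line + 1):
        counts[line % NUM_BANKS] += 1
    return counts


if __name__ == "__main__":
    # A 1 KiB buffer covers 16 lines, spread evenly over the 8 assumed banks.
    print(bank_histogram(0, 1024))
```

Because each line has a single home bank in this sketch, no second copy of it can exist elsewhere in the LLC, which is the property that removes the need to keep banks coherent with one another.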

    Interaction patterns for smart spaces: a confident interaction design solution for pervasive sensitive IoT services

    Smart spaces are a powerful tool for deploying new pervasive, sensitive services that are based on Internet of Things (IoT) products and developed close to users in the current Information Society. Researchers have focused their efforts on new techniques to improve systems and products in this area, but have neglected the human factors related to the psychological aspects of users and their psycho-social relationship with the deployment space where they live. This research proposes taking these cognitive features into account in the early stages of smart-space design by defining a set of interaction patterns. Using this set of interaction patterns, it is possible to influence the confidence that users develop while using IoT products and the services based on them. An evaluative verification has been carried out to assess how this design engineering approach has a real impact on generating confidence in the users of this kind of technology.

    Performance Characterization of State-Of-The-Art Deep Learning Workloads on an IBM Minsky Platform

    Deep learning algorithms are known to demand significant computing horsepower, in particular when it comes to training these models. The capability of developing new algorithms and improving the existing ones is in part determined by the speed at which these models can be trained and tested. One way to attain significant performance gains is hardware acceleration. However, deep learning has evolved into a large variety of models, including but not limited to fully-connected, convolutional, recurrent and memory networks. It therefore appears unlikely that a single solution can provide effective acceleration for this entire deep learning ecosystem. This work presents detailed characterization results for a set of archetypal state-of-the-art deep learning workloads on a latest-generation IBM POWER8 system with NVIDIA Tesla P100 GPUs and NVLink interconnects. The goal is to identify the performance bottlenecks (i.e. the accelerable portions) and so provide a thorough study that can guide the design of prospective acceleration platforms more effectively. In addition, we analyze the role of the GPU (as one particular type of acceleration engine) and its effectiveness as a function of problem size.

    Landslide Risk: Economic Valuation in the North-Eastern Zone of Medellin City

    Natural disasters of a geodynamic nature can cause enormous economic and human losses. The economic costs of a landslide disaster include relocation of communities and physical repair of urban infrastructure. However, quantitative risk analyses generally do not take the indirect economic consequences of such an event into account. We propose a probabilistic methodology that considers several hazard and vulnerability scenarios to measure the magnitude of the landslide and to quantify the economic costs. With this approach, it is possible to carry out a quantitative evaluation of landslide risk, allowing the economic losses to be calculated before a potential disaster in an objective, standardized and reproducible way, taking into account the uncertainty of the building costs in the study zone. The possibility of comparing different scenarios facilitates the urban planning process, the optimization of interventions to reduce risk to acceptable levels and an assessment of economic losses according to the magnitude of the damage. To develop and explain the proposed methodology, a simple case study is presented, located in the north-eastern zone of the city of Medellín. This area has particular geomorphological characteristics and contains several buildings in poor structural condition. The proposed methodology makes it possible to estimate the probable economic losses from earthquake-induced landslides, taking into account the uncertainty of the building costs in the study zone. The estimate shows that structural intervention in the buildings reduces the total landslide risk by about 21%. © Published under licence by IOP Publishing Ltd
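
The abstract above combines hazard and vulnerability scenarios with uncertain building costs. A minimal Monte Carlo sketch of that general idea is given below in Python. It assumes a uniform distribution for the building cost and defines the loss as the scenario probability times the damaged fraction times the cost. Every number, and the function names `simulate_losses` and `relative_reduction`, are hypothetical and are not taken from the paper, whose actual model may differ.

```python
import random
from statistics import mean


def simulate_losses(scenarios, cost_low, cost_high, n_draws=10_000, seed=0):
    """Return one expected-loss value per Monte Carlo draw of the building cost.

    scenarios: list of (probability, vulnerability) pairs, where vulnerability
    is the damaged fraction of the building value, in [0, 1].
    """
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    losses = []
    for _ in range(n_draws):
        cost = rng.uniform(cost_low, cost_high)  # assumed uniform cost uncertainty
        losses.append(sum(p * v * cost for p, v in scenarios))
    return losses


def relative_reduction(baseline, mitigated):
    """Fractional drop in mean loss after an intervention."""
    return 1.0 - mean(mitigated) / mean(baseline)


# Hypothetical inputs, for illustration only.
baseline = simulate_losses([(0.10, 0.5), (0.01, 1.0)], 80.0, 120.0)
mitigated = simulate_losses([(0.10, 0.3), (0.01, 0.8)], 80.0, 120.0)
print(f"mean baseline loss: {mean(baseline):.2f}")
print(f"relative reduction: {relative_reduction(baseline, mitigated):.1%}")
```

Comparing a baseline run with a run that uses lower vulnerabilities mirrors, in a very simplified way, how the effect of a structural intervention on total risk could be expressed as a percentage.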

    Environmental health : reflexions of the Brazilian Association of Post-Graduation in Collective Health - ABRASCO

    Despite its extraordinary biodiversity and the enormous installed potential to develop integrated actions on the theme of the environment, Brazil has not, from a programmatic point of view, given the environment the priority it deserves. The Brazilian Association of Post-Graduation in Collective Health (ABRASCO) recognized the importance of organizing a “Health and Environment” Thematic Group in order to participate, in a more organized way, in the struggle for sustainable development through political action in the field of collective health, in pursuit of healthy environments and health promotion. The main objective of this Thematic Group (GT) was to help internalize the theme of environmental health in the field of Collective Health. Method: The Group chose three axes for discussion in a workshop at the V Brazilian Congress of Epidemiology, in Curitiba, in 2002. The result of the debate was presented according to three axes: identification of the theoretical-conceptual field in Health and Environment; health and environment policy; and the methodological path. The conclusion was presented in the form of a GT agenda for the 2002-2004 biennium.
ABSTRACT: Notwithstanding its extraordinary biodiversity and enormous installed potential to develop actions integrated into the topic of environmental health, Brazil has not given the environment the priority the subject deserves from the programmatic point of view. The Brazilian Association of Post-Graduation in Collective Health - ABRASCO recognized the importance of organizing a Thematic Group “Health and Environment” in order to participate, in a more organized fashion, in the struggle for sustainable development, through political action in collective health, oriented towards healthy environments and health promotion. The main objective of this Thematic Group was to help the subject of environmental health be internalized into the field of Collective Health. Method: The theme was debated in a workshop during the V Brazilian Congress of Epidemiology in Curitiba, in 2002. Results were presented according to three axes: 1) identification of the theoretical-conceptual field in Environmental Health; 2) the politics of health and environment; 3) the methodological route. The general conclusion was presented as an agenda for ABRASCO’s Thematic Group in Environmental Health for the 2002-2004 biennium.

    Permanent Genetic Resources added to Molecular Ecology Resources Database 1 February 2013-31 March 2013

    This article documents the addition of 142 microsatellite marker loci to the Molecular Ecology Resources database. Loci were developed for the following species: Agriophyllum squarrosum, Amazilia cyanocephala, Batillaria attramentaria, fungal strain CTeY1 (Ascomycota), Gadopsis marmoratus, Juniperus phoenicea subsp. turbinata, Liriomyza sativae, Lupinus polyphyllus, Metschnikowia reukaufii, Puccinia striiformis and Xylocopa grisescens. These loci were cross-tested on the following species: Amazilia beryllina, Amazilia candida, Amazilia rutila, Amazilia tzacatl, Amazilia violiceps, Amazilia yucatanensis, Campylopterus curvipennis, Cynanthus sordidus, Hylocharis leucotis, Juniperus brevifolia, Juniperus cedrus, Juniperus osteosperma, Juniperus oxycedrus, Juniperus thurifera, Liriomyza bryoniae, Liriomyza chinensis, Liriomyza huidobrensis and Liriomyza trifolii. © 2013 John Wiley & Sons Ltd. Peer Reviewed

    Antimicrobial resistance among migrants in Europe: a systematic review and meta-analysis

    BACKGROUND: Rates of antimicrobial resistance (AMR) are rising globally and there is concern that increased migration is contributing to the burden of antibiotic resistance in Europe. However, the effect of migration on the burden of AMR in Europe has not yet been comprehensively examined. Therefore, we did a systematic review and meta-analysis to identify and synthesise data for AMR carriage or infection in migrants to Europe to examine differences in patterns of AMR across migrant groups and in different settings. METHODS: For this systematic review and meta-analysis, we searched MEDLINE, Embase, PubMed, and Scopus with no language restrictions from Jan 1, 2000, to Jan 18, 2017, for primary data from observational studies reporting antibacterial resistance in common bacterial pathogens among migrants to 21 European Union-15 and European Economic Area countries. To be eligible for inclusion, studies had to report data on carriage or infection with laboratory-confirmed antibiotic-resistant organisms in migrant populations. We extracted data from eligible studies and assessed quality using piloted, standardised forms. We did not examine drug resistance in tuberculosis and excluded articles solely reporting on this parameter. We also excluded articles in which migrant status was determined by ethnicity, country of birth of participants' parents, or was not defined, and articles in which data were not disaggregated by migrant status. Outcomes were carriage of or infection with antibiotic-resistant organisms. We used random-effects models to calculate the pooled prevalence of each outcome. The study protocol is registered with PROSPERO, number CRD42016043681. FINDINGS: We identified 2274 articles, of which 23 observational studies reporting on antibiotic resistance in 2319 migrants were included. 
The pooled prevalence of any AMR carriage or AMR infection in migrants was 25·4% (95% CI 19·1-31·8; I²=98%), including meticillin-resistant Staphylococcus aureus (7·8%, 4·8-10·7; I²=92%) and antibiotic-resistant Gram-negative bacteria (27·2%, 17·6-36·8; I²=94%). The pooled prevalence of any AMR carriage or infection was higher in refugees and asylum seekers (33·0%, 18·3-47·6; I²=98%) than in other migrant groups (6·6%, 1·8-11·3; I²=92%). The pooled prevalence of antibiotic-resistant organisms was slightly higher in high-migrant community settings (33·1%, 11·1-55·1; I²=96%) than in migrants in hospitals (24·3%, 16·1-32·6; I²=98%). We did not find evidence of high rates of transmission of AMR from migrant to host populations. INTERPRETATION: Migrants are exposed to conditions favouring the emergence of drug resistance during transit and in host countries in Europe. Increased antibiotic resistance among refugees and asylum seekers and in high-migrant community settings (such as refugee camps and detention facilities) highlights the need for improved living conditions, access to health care, and initiatives to facilitate detection of and appropriate high-quality treatment for antibiotic-resistant infections during transit and in host countries. Protocols for the prevention and control of infection and for antibiotic surveillance need to be integrated in all aspects of health care, which should be accessible for all migrant groups, and should target determinants of AMR before, during, and after migration. FUNDING: UK National Institute for Health Research Imperial Biomedical Research Centre, Imperial College Healthcare Charity, the Wellcome Trust, and UK National Institute for Health Research Health Protection Research Unit in Healthcare-associated Infections and Antimicrobial Resistance at Imperial College London.
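
The abstract states that random-effects models were used to pool prevalence, but it does not name the estimator. The Python sketch below shows one common choice, the DerSimonian-Laird method applied to untransformed proportions, purely as an assumed illustration; the review may have used a different estimator or a transformation. The function name and the study counts in the usage line are hypothetical.

```python
import math


def pooled_prevalence_dl(events, totals):
    """DerSimonian-Laird random-effects pooling of raw proportions.

    Returns (pooled prevalence, (ci_low, ci_high), I-squared in percent).
    Requires 0 < events[i] < totals[i] so every within-study variance is positive.
    """
    p = [e / n for e, n in zip(events, totals)]
    var = [pi * (1 - pi) / n for pi, n in zip(p, totals)]  # binomial variance
    w = [1 / v for v in var]                               # fixed-effect weights
    fixed = sum(wi * pi for wi, pi in zip(w, p)) / sum(w)
    q = sum(wi * (pi - fixed) ** 2 for wi, pi in zip(w, p))  # Cochran's Q
    df = len(p) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0  # between-study variance
    w_star = [1 / (v + tau2) for v in var]           # random-effects weights
    pooled = sum(wi * pi for wi, pi in zip(w_star, p)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se), i2


# Hypothetical study counts, for illustration only.
estimate, ci, i2 = pooled_prevalence_dl([5, 50, 30], [100, 100, 120])
print(f"pooled={estimate:.3f}, 95% CI=({ci[0]:.3f}, {ci[1]:.3f}), I2={i2:.0f}%")
```

The I² value returned here is the same heterogeneity statistic reported alongside each pooled estimate in the findings; values near 100% indicate that most of the observed variation lies between studies rather than within them.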